LEARN RSA Masterplan — Full Build (Embedded)

This is the full, embedded master document. All steps are included inline for a single‑file presentation.


PART I — BACKGROUND + LITERATURE FOUNDATION

Step 1 — Annotated Background + Literature Map (Deep Dive)

This section is the foundational narrative for the project. It translates the proposal, internal presentations, and notes into a coherent background with explicit links to repo sources. The goal is a high‑signal literature map that motivates the analyses.


1) Core Research Questions (Framing)

Primary questions

  1. Do adolescents’ neural representations of peers become organized according to the true peer structure as learning unfolds?
  2. Are those representations more idiosyncratic (less group‑aligned) in youth with higher social anxiety?

Source anchors


2) Task Context: From Virtual School Task to LEARN

Virtual School Task (VST):

LEARN Task (modified VST):

Why this matters:


3) Social Anxiety Context + Measures

Clinical context:

Measures used in this project:


4) Learning Dynamics: Prior Findings + Relevance

Why learning matters here:

Computational context (from internal materials):

Bridge to RSA:


5) Papers Presentation Deep Dive (Grant + Task + Modeling + Neural Findings)

This is the core internal evidence base for the LEARN task, sample, modeling approach, and prior findings. The goal here is to extract methods, scales, and results that directly inform RSA design.

5.1 F31 Grant (Clarkson_F31_resub_080819.pdf)

Relevance: This defines the computational learning architecture that LEARN RSA must complement (representational geometry layered on top of learning rates).

5.2 LEARN Task Slides (Learn.pptx)

Relevance: Provides the explicit RSA hypothesis set and peer‑level averaging logic that is the immediate basis for your model‑RDM construction with current averaged betas.

5.2a Trial Outcomes + Prediction Error Types (from LEARN slides)

5.2b RSA Hypotheses from LEARN slides

5.2c Context Labels Used in Modeling

From the model validation slides, peer contexts are labeled:

These labels reappear in model‑validation figures and should be used when organizing RSA outputs by context.

5.3 Clarkson Defense (Clarkson_Defense_2.0.pptx)

5.3a Model taxonomy (M1–M10)

5.3b Parsimony + model comparison

5.3c Neural analysis notes

5.3d ROI‑specific PE pattern (from LEARN slides)

Relevance: Establishes which model‑based signals are meaningful and which ROIs show SA effects, guiding RSA ROI prioritization and model‑RDM choice.

5.4 MS Figures (MS_figures_all_052325_jj[27].pptx)

Relevance: These figures anchor the specific contextual axes (valence, predictability) that should be encoded into model‑RDMs.

5.5 Clarkson Manuscript (Clarkson_2024_manuscript…docx)

Relevance: The manuscript establishes that learning dynamics differ by SA, motivating RSA to test whether representational structure also differs.

5.6 Takeaways for RSA Design

5.7 Papers Presentation Summary Table

5.8 Bridges to RSA Implementation

| Source | Sample/Population | Methods/Models | Key Findings/Outputs | RSA Implication |
|---|---|---|---|---|
| Learn.pptx | 47 adolescents (10–15); SA cutoff ≥7 | LEARN task + RSA hypotheses | Disposition/Predictability/Negativity RDMs; peer‑level averaging | Directly defines model‑RDMs |
| Clarkson_Defense_2.0.pptx | Same cohort | M1–M10 learning models; PE‑based fMRI | SA: faster adjustments; reputation‑based PE effects in vmPFC/dACC/insula/vStr | ROI + model‑RDM selection |
| MS_figures_all_052325_jj[27].pptx | Same cohort | Model validation + PE decomposition | Learning rate + associative value differences by context + SA | Encode context axes in RDMs |
| Clarkson_2024 manuscript | ~47 youth | Computational modeling + fMRI | Higher associability for unexpected negative feedback | Supports SA‑linked learning dynamics |
| Clarkson_F31 grant | N≈60 (10–15) | Computational models + connectivity | Learning parameters + FC approach | Computational layer to compare with RSA |

6) RSA Papers Map (Methods + Findings + Relevance)

This section extracts methodology + findings from the RSA paper folder and links each paper to the LEARN RSA aims.

6.1 Greco et al., 2024 — Predictive learning shapes representational geometry

File: Predictive learning shapes the representational geometry of the human brain _ Nature Communications.pdf
Methodology (from abstract): MEG recorded while participants listened to acoustic sequences with different statistical regularities.
Key finding: Representational geometry aligns to the statistical structure of the environment; clustering of predictable stimuli; alignment magnitude correlates with prediction‑error encoding.
Relevance to LEARN: Direct precedent for model‑RDM alignment logic—learning reorganizes geometry to match true structure.

6.2 Finn et al., 2020 — Idiosynchrony / IS‑RSA

File: nihms-1585696.pdf
Methodology (from abstract): Review + framework paper introducing inter‑subject representational similarity analysis (IS‑RSA), demonstrated using naturalistic movie data (HCP).
Key finding: IS‑RSA recovers brain‑behavior relationships by quantifying idiosyncratic vs shared neural responses.
Relevance to LEARN: Methodological foundation for idiosyncrasy analysis (Anna Karenina approach).

6.3 Baek et al., 2023 — Lonely individuals process the world in idiosyncratic ways

File: Lonely individuals process the world in idiosynractic ways.pdf
Methodology (from abstract/methods): fMRI of first‑year students; naturalistic stimuli; measure alignment of neural responses across individuals.
Key finding: Lonelier individuals show less shared neural responses, especially in default‑mode regions; effect persists controlling demographics and social ties.
Relevance to LEARN: Supports the idea that social disconnection ↔ idiosyncrasy, grounding the SA idiosyncrasy hypothesis.

6.4 Shen et al., 2025 — Neural similarity predicts who becomes friends

File: neuralsimpredictswhobecomesfriends.pdf
Methodology (from methods snippet): fMRI responses to stimuli; social network mapped over time (Time 1 → Time 2/3).
Key finding: Pre‑existing neural similarity predicts later friendship proximity and trajectories.
Relevance to LEARN: Establishes functional significance of shared neural geometry for real‑world social bonding.

6.5 Camacho et al., 2024 — Higher inter‑subject variability in youth with higher SA

File: nihms-2066703.pdf
Methodology (from abstract): Healthy Brain Network (N≈740; ages 5–15), naturalistic movies; tested mean activity and inter‑subject variability vs SCARED.
Key finding: No mean differences, but higher inter‑subject variability in high‑SA youth (posterior cingulate, supramarginal, IFG).
Relevance to LEARN: Direct evidence that SA relates to neural variability, supporting idiosyncrasy predictions.

6.6 Lamba et al., 2020 — Anxiety impedes adaptive social learning under uncertainty

File: lamba-et-al-2020-anxiety-impedes-adaptive-social-learning-under-uncertainty.pdf
Methodology (from abstract): Dynamic trust game + matched nonsocial task; computational modeling of learning under uncertainty.
Key finding: Anxious participants over‑invest in exploitative partners; modeling suggests reduced learning from negative social events and failure to scale learning with uncertainty.
Relevance to LEARN: Anchors the uncertainty‑learning angle; motivates testing learning dynamics in SA with a controlled social feedback task.


6.7 RSA Papers — Methods/Findings Matrix

| Paper | Paradigm | Sample | Analysis Type | Key Finding | Direct Link to LEARN |
|---|---|---|---|---|---|
| Greco 2024 | MEG, auditory sequences | Human adults | RSA on representational geometry | Geometry aligns to statistical structure; linked to PE encoding | Supports model‑RDM alignment across learning |
| Finn 2020 | Naturalistic fMRI (movies) | HCP | IS‑RSA framework | Idiosynchrony captures brain‑behavior relations | Method backbone for idiosyncrasy |
| Baek 2023 | Naturalistic fMRI | 66 first‑year students | Inter‑subject similarity | Lonelier people show less shared neural responses | Social disconnection ↔ idiosyncrasy |
| Shen 2025 | fMRI + social network | Cohort over time | Neural similarity vs friendship distance | Pre‑existing similarity predicts later friendship | Shared geometry predicts social bonding |
| Camacho 2024 | Naturalistic movies | N≈740 youth | Mean activation vs variability | SA ↔ higher inter‑subject variability | SA‑linked idiosyncrasy in youth |
| Lamba 2020 | Trust game + nonsocial | n≈400 | Computational learning under uncertainty | Anxiety reduces learning from negative social events | Motivates uncertainty‑learning axis |

6.8 Methodological Takeaways for LEARN RSA

6.9 RSA Papers Micro‑Summaries (Methods → Findings → LEARN Link)

These are tight, method‑level summaries of the RSA papers with explicit bridges to LEARN RSA.

Greco 2024 — Predictive learning shapes representational geometry

Finn 2020 — Idiosynchrony (IS‑RSA framework)

Baek 2023 — Lonely individuals process the world in idiosyncratic ways

Shen 2025 — Neural similarity predicts friendship

Camacho 2024 — SA and inter‑subject variability

Lamba 2020 — Anxiety impedes adaptive social learning


6.10 RSA‑to‑LEARN Bridges (Explicit)

Bridge 1: Geometry alignment

Bridge 2: Idiosyncrasy

Bridge 3: Social consequence

Bridge 4: Uncertainty learning


7) Social Learning Papers (Theory + Method Context)

These papers in Learn/Social Learning provide theoretical framing for how people learn about others, update impressions, and use conceptual structure to simplify social learning.

7.1 Hackel et al. — Simplifying Social Learning (Opinion)

Files: Learn/Social Learning/Learning Model.pdf, Learn/Social Learning/Simplifying.pdf

7.2 Mende‑Siedlecki — Dynamic Impression Updating

File: Learn/Social Learning/LearningStyle.pdf

7.3 He et al. 2025 — mPFC Linking Mentalizing + Attachment Schemas

File: Learn/Social Learning/Nim_Tot.pdf

7.4 Methodological Takeaways for LEARN


8) Proposal vs. Papers: Where This Project Sits

Project_Proposal.docx proposes:

How papers map onto it:

Net: the proposal sits at the intersection of social learning computation and representational geometry, extending those ideas into a clinically relevant adolescent population using RSA.


9) RSA Rationale: Why Representation (Not Just Activation)

RSA captures relational geometry, not just amplitude. It asks: Which conditions look similar to each other in neural pattern space? [Project_Proposal.docx; RSA_notes.docx]
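That question can be made concrete with a minimal sketch (simulated condition × voxel patterns; the correlation‑distance RDM is the relational geometry RSA operates on):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)

# Simulated condition × voxel patterns (4 peer conditions, 50 voxels)
patterns = rng.standard_normal((4, 50))

# Neural RDM: correlation distance between condition patterns
# Small entries = conditions that "look similar" in pattern space
neural_rdm = squareform(pdist(patterns, metric="correlation"))
```

The amplitude of any single condition never enters the RDM; only the pairwise relations between condition patterns do.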

Representational learning premise:

Internal RSA notes emphasize:


10) Idiosyncrasy: Why Group Alignment Matters

Concept:

Key implication:


11) Candidate Brain Systems (From Repo)

Core ROIs specified in the proposal:

Extended mentalizing network (ROI notes):

Why this matters:


12) Internal Source Map (What Each Folder Contributes)

Proposal + Core Theory

Task Structure + Timing

Social Learning Prior Findings / Context

RSA Method Notes

ROI Justifications

Smoothing Rationale

Source Code Repos


13) Background Summary (Narrative Paragraph)

Adolescence is a sensitive period for social evaluation and learning. Socially anxious youth show altered expectations and responses to social feedback, yet how their brains represent the structure of their social world remains unclear. The LEARN task provides a controlled environment where peers vary along hidden dimensions of disposition and predictability, requiring participants to build internal models of peer behavior. RSA is uniquely suited to quantify whether neural representational geometry aligns with this true structure over time and whether those representations become more idiosyncratic in higher social anxiety. Core ROIs spanning valuation and salience (vmPFC, dACC, insula, striatum) and extended mentalizing regions (mPFC, TPJ, temporal pole, precuneus) offer a biologically grounded substrate for both learning and idiosyncrasy hypotheses.


14) Output of Step 1

This step produces:

Next: Step 2 formalizes the task into exact data schemas and beta requirements.


PART II — TASK FORMALIZATION + SCHEMAS

Step 2 — LEARN Task Formalization + Data Schemas + Beta Requirements

This section formalizes the LEARN task into exact condition schemas, data tables, and beta requirements, so the RSA pipeline can be implemented without ambiguity.


1) Task Structure (Formal Definition)

Entities

Peer structure

Trial epochs


2) Condition Schema (Canonical Labels)

2.1 Peer labels and context

| Peer | Disposition | Predictability | Context label |
|---|---|---|---|
| P1 | Nice | Predictable | Npred |
| P2 | Nice | Unpredictable | Nunpred |
| P3 | Mean | Predictable | Mpred |
| P4 | Mean | Unpredictable | Munpred |

2.2 Trial outcome labels

| Prediction | Feedback | Accuracy | PE Type |
|---|---|---|---|
| Nice | Nice | Correct | No PE (positive feedback) |
| Mean | Mean | Correct | No PE (negative feedback) |
| Nice | Mean | Incorrect | Negative PE |
| Mean | Nice | Incorrect | Positive PE |
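The outcome table reduces to a small classifier (an illustrative helper, not repo code):

```python
def pe_type(prediction: str, feedback: str) -> str:
    """Classify a trial's prediction error per the outcome table."""
    if prediction == feedback:
        return "none"  # correct prediction: no PE
    # Incorrect prediction: PE sign follows the feedback actually received
    return "negative_PE" if feedback == "Mean" else "positive_PE"
```

This keeps the `pe_type` column of the master table reproducible from `prediction` and `feedback` alone.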


3) Data Schema (Long‑form Master Table)

This is the canonical schema for trial‑level data. It should be the backbone for all downstream RSA and modeling.

| subject | run | trial | peer | disp | pred | prediction | feedback | accuracy | pe_type | valence | rt | beta_path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S001 | 1 | 1 | P1 | Nice | Pred | Nice | Nice | 1 | none | pos | 1.2 | /path/... |

Notes:
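A lightweight schema check keeps downstream code honest about this table (a sketch; the single row is illustrative):

```python
import pandas as pd

# Canonical columns of the long-form master table (order matters for audits)
REQUIRED_COLS = ["subject", "run", "trial", "peer", "disp", "pred", "prediction",
                 "feedback", "accuracy", "pe_type", "valence", "rt", "beta_path"]

# One illustrative row matching the example above
df = pd.DataFrame([{
    "subject": "S001", "run": 1, "trial": 1, "peer": "P1", "disp": "Nice",
    "pred": "Pred", "prediction": "Nice", "feedback": "Nice", "accuracy": 1,
    "pe_type": "none", "valence": "pos", "rt": 1.2, "beta_path": "/path/...",
}])

missing = [c for c in REQUIRED_COLS if c not in df.columns]
assert not missing, f"missing columns: {missing}"
```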


3.1 Data Sources + Subject‑Level Merge (Behavioral / Survey / Demographics)

Linux source paths (auditing key):

Subject‑level merged table (local repo artifact):


4) Beta Requirements Matrix (Explicit)

| Analysis | Minimal Beta Level | Condition Count | Feasible with current betas? |
|---|---|---|---|
| Collapsed model‑RDM | Subject × ROI × Peer | 4 | Yes |
| Collapsed Peer×Feedback RDM | Subject × ROI × Peer×Valence | 8 | Yes |
| Run‑wise model‑RDM | Subject × ROI × Run × Peer | 16 (4 peers × 4 runs) | No |
| Run‑wise Peer×Feedback | Subject × ROI × Run × Peer×Valence | 32 | No |
| Trial‑wise RSA / PE models | Subject × ROI × Trial | 128 | No |


5) Data‑Ready Schemas (If only averaged betas)

5.1 Current betas: Peer × FeedbackValence

One subject, one ROI:

| condition | meaning |
|---|---|
| P1_pos | Peer1, positive feedback |
| P1_neg | Peer1, negative feedback |
| P2_pos | Peer2, positive feedback |
| P2_neg | Peer2, negative feedback |
| P3_pos | Peer3, positive feedback |
| P3_neg | Peer3, negative feedback |
| P4_pos | Peer4, positive feedback |
| P4_neg | Peer4, negative feedback |
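Downstream RSA code should always stack these eight conditions in one fixed order (a sketch with simulated data standing in for real betas):

```python
import numpy as np

# Canonical condition order — every subject × ROI matrix must follow it
CONDITIONS = ["P1_pos", "P1_neg", "P2_pos", "P2_neg",
              "P3_pos", "P3_neg", "P4_pos", "P4_neg"]

rng = np.random.default_rng(0)
n_voxels = 200

# One row per condition, simulated here; in practice each row is a loaded beta
pattern_matrix = np.vstack([rng.standard_normal(n_voxels) for _ in CONDITIONS])
```

Fixing the order once means every neural RDM built from `pattern_matrix` is directly comparable to the model RDMs in Step 3.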

5.2 Aggregation logic (for idiosyncrasy)


6) Beta Manifest Templates

6.1 Minimal (current data)

subject,roi,peer,valence,beta_path
S001,vmPFC,P1,pos,/path/S001_vmPFC_P1_pos.nii.gz
S001,vmPFC,P1,neg,/path/S001_vmPFC_P1_neg.nii.gz
...
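Before any RSA runs, the manifest should be audited for completeness (a sketch using the CSV header above; paths are illustrative):

```python
import pandas as pd
from itertools import product

# Illustrative manifest rows matching the minimal CSV template
rows = [("S001", "vmPFC", p, v, f"/path/S001_vmPFC_{p}_{v}.nii.gz")
        for p, v in product(["P1", "P2", "P3", "P4"], ["pos", "neg"])]
manifest = pd.DataFrame(
    rows, columns=["subject", "roi", "peer", "valence", "beta_path"])

# Each subject × ROI cell must contain all 8 peer × valence betas
counts = manifest.groupby(["subject", "roi"]).size()
assert (counts == 8).all(), counts[counts != 8]
```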

6.2 Run‑wise (future data)

subject,roi,run,peer,beta_path
S001,vmPFC,1,P1,/path/S001_vmPFC_run1_P1.nii.gz
S001,vmPFC,1,P2,/path/S001_vmPFC_run1_P2.nii.gz
...

6.3 Trial‑wise (future beta series)

subject,roi,run,trial,peer,valence,pe_type,beta_path
S001,vmPFC,1,1,P1,pos,none,/path/S001_vmPFC_run1_trial1.nii.gz
...

7) Condition Maps → Model RDMs

7.1 Peer model (4 conditions)

# P1..P4 peer model
peer_features = np.array([
    [1,1],  # Npred
    [1,0],  # Nunpred
    [0,1],  # Mpred
    [0,0],  # Munpred
])

7.2 Peer×Feedback model (8 conditions)

conditions = ["P1_pos","P1_neg","P2_pos","P2_neg","P3_pos","P3_neg","P4_pos","P4_neg"]
valence = np.array([1,0,1,0,1,0,1,0])
peer_id = np.array([1,1,2,2,3,3,4,4])

rdm_feedback = np.abs(valence[:,None]-valence[None,:])
rdm_peer = (peer_id[:,None]!=peer_id[None,:]).astype(int)

8) Visual Diagrams (Conceptual)

8.1 Peer structure

Nice:     P1 (pred)   P2 (unpred)
Mean:     P3 (pred)   P4 (unpred)

8.2 Trial sequence

Prediction (4s) → Feedback (3s) → Response (4s)

9) Outputs of Step 2

Next: Step 3 builds model‑RDM suite and full vectorization logic.


PART III — MODEL‑RDM SUITE

Step 3 — Model‑RDM Suite (Learning) — Deep, Commented, LEARN‑Specific

This section gives fully worked, commented code that builds the exact model RDMs you need, in the exact structure implied by LEARN.

Three required model families

  1. Peer similarity (4 peers only)
  2. Feedback similarity (+ vs −)
  3. Peer × Feedback similarity (idealized matrix for all 8 conditions)

0) Definitions and Conventions

Peers (canonical order):

Feedback valence:

Peer × Feedback conditions (8 total, fixed order):

P1_pos, P1_neg, P2_pos, P2_neg, P3_pos, P3_neg, P4_pos, P4_neg

1) Peer‑Level Model RDMs (4×4)

These models operate on 4 peer conditions only (no feedback split).

1.1 Peer Disposition RDM (Nice vs Mean)

Peers cluster by valence.

import numpy as np
from scipy.spatial.distance import pdist, squareform

# Nice=1, Mean=0
valence = np.array([1, 1, 0, 0]).reshape(-1, 1)

# Euclidean distance gives 0 if same valence, 1 if different
rdm_disp = squareform(pdist(valence, metric="euclidean"))
print(rdm_disp)

1.2 Peer Predictability RDM (Pred vs Unpred)

Peers cluster by predictability.

# Pred=1, Unpred=0
pred = np.array([1, 0, 1, 0]).reshape(-1, 1)
rdm_pred = squareform(pdist(pred, metric="euclidean"))
print(rdm_pred)

1.3 Peer Combined RDM (Disposition + Predictability)

Captures both dimensions simultaneously.

peer_features = np.array([
    [1,1],  # P1 Npred
    [1,0],  # P2 Nunpred
    [0,1],  # P3 Mpred
    [0,0],  # P4 Munpred
])

rdm_combo = squareform(pdist(peer_features, metric="euclidean"))
print(rdm_combo)

1.4 Negativity‑Weighted RDM

Explicitly encodes negative‑bias: mean peers more similar to each other than nice peers.

# Hand‑built negativity‑weighted dissimilarity
rdm_neg = np.array([
    [0,   0.5, 1, 1],
    [0.5, 0,   1, 1],
    [1,   1,   0, 0],
    [1,   1,   0, 0],
])
print(rdm_neg)

2) Feedback‑Only Model RDM (8×8)

This model ignores peer identity and groups conditions only by feedback valence.

import numpy as np

conditions = [
    "P1_pos", "P1_neg",
    "P2_pos", "P2_neg",
    "P3_pos", "P3_neg",
    "P4_pos", "P4_neg",
]

# 1=pos, 0=neg
valence = np.array([1,0, 1,0, 1,0, 1,0])

# 0 if same valence, 1 if different
rdm_feedback = np.abs(valence[:,None] - valence[None,:])

print(rdm_feedback)

3) Peer×Feedback Model RDM (8×8) — FULLY EXPLICIT

This is the full combined model: an idealized dissimilarity matrix in which similarity depends on both peer identity and feedback valence.

3.1 Building Blocks

We build the peer×feedback model as a weighted sum of three components:

  1. Peer dissimilarity matrix (same peer = 0, different peer = 1)
  2. Feedback dissimilarity matrix (same valence = 0, different = 1)
  3. Contextual dissimilarity matrix (mean of the disposition and predictability contrasts)

3.2 Step‑by‑Step Construction (commented)

import numpy as np

# --- 1) Define condition labels and features ---
conditions = [
    "P1_pos", "P1_neg",
    "P2_pos", "P2_neg",
    "P3_pos", "P3_neg",
    "P4_pos", "P4_neg",
]

# Peer identity per condition
peer_id = np.array([1,1, 2,2, 3,3, 4,4])

# Feedback valence per condition
valence = np.array([1,0, 1,0, 1,0, 1,0])  # pos=1, neg=0

# Disposition and predictability per peer
# P1=Npred, P2=Nunpred, P3=Mpred, P4=Munpred
peer_disp = {1:1, 2:1, 3:0, 4:0}  # nice=1, mean=0
peer_pred = {1:1, 2:0, 3:1, 4:0}  # pred=1, unpred=0

# Expand to condition level
disp = np.array([peer_disp[i] for i in peer_id])
pred = np.array([peer_pred[i] for i in peer_id])

# --- 2) Build base matrices ---
# Peer similarity (0 same peer, 1 different peer)
rdm_peer = (peer_id[:,None] != peer_id[None,:]).astype(int)

# Feedback similarity (0 same valence, 1 different)
rdm_feedback = np.abs(valence[:,None] - valence[None,:])

# Disposition similarity (nice vs mean)
rdm_disp = np.abs(disp[:,None] - disp[None,:])

# Predictability similarity (pred vs unpred)
rdm_pred = np.abs(pred[:,None] - pred[None,:])

# --- 3) Combine into a full Peer×Feedback model ---
# Weighted sum (weights can be tuned or compared)
# Example weights: peer identity matters most; feedback matters second; context matters third
w_peer = 0.5
w_fb   = 0.3
w_ctx  = 0.2

rdm_peer_feedback = (w_peer * rdm_peer) + (w_fb * rdm_feedback) + (w_ctx * (rdm_disp + rdm_pred)/2)

print(rdm_peer_feedback)

3.3 Interpretation


4) Vectorization (All Models)

Every RDM is vectorized using lower triangle (k=-1).

# 4×4 vectorization
tri4 = np.tril_indices(4, k=-1)
vec_disp = rdm_disp[tri4]
vec_pred = rdm_pred[tri4]
vec_combo = rdm_combo[tri4]
vec_neg = rdm_neg[tri4]

# 8×8 vectorization
tri8 = np.tril_indices(8, k=-1)
vec_feedback = rdm_feedback[tri8]
vec_peer_fb  = rdm_peer_feedback[tri8]

5) Model Regression (Comparing Multiple RDMs)

from sklearn.linear_model import LinearRegression

# Example: regress a neural RDM on multiple model RDMs
# neural_rdm: observed 8×8 RDM for one subject/ROI (assumed built earlier); tri8 from Section 4
Y = neural_rdm[tri8]
X = np.vstack([
    rdm_feedback[tri8],
    rdm_peer[tri8],
    rdm_peer_feedback[tri8],
]).T

reg = LinearRegression().fit(X, Y)
print(reg.coef_)  # weights for each model
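Because regression weights depend on each model's scale, a common complement is rank correlation of each model RDM with the neural RDM (a sketch; the neural RDM is assumed to come from earlier steps):

```python
import numpy as np
from scipy.stats import spearmanr

def model_fit_spearman(neural_rdm, model_rdms, n_cond=8):
    """Spearman-correlate a neural RDM with each model RDM (lower triangle only)."""
    tri = np.tril_indices(n_cond, k=-1)
    return {name: spearmanr(neural_rdm[tri], rdm[tri])[0]
            for name, rdm in model_rdms.items()}
```

A model that perfectly matches the neural geometry returns rho = 1, so competing models can be ranked before any regression weighting is attempted.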

6) Summary Output of Step 3

Next: Step 4 builds the Idiosyncrasy (IS‑RSA) suite with validation and SA‑linked modeling.


PART IV — IDIOSYNCRASY SUITE

Step 4 — Idiosyncrasy (IS‑RSA) Suite — Deep, Commented, LEARN‑Specific

This section defines the idiosyncrasy analysis in detail, with full code templates that work with your current averaged betas and scale to run‑wise or trial‑wise data later.


1) Concept: What Idiosyncrasy Means Here


2) Input Data Structures (Current vs Future)

2.1 Current betas (averaged)

2.2 Future betas (run‑wise)


3) Build Subject Pattern Matrices

3.1 Current data (averaged betas)

import numpy as np

# patterns_pos: subjects x voxels
# patterns_neg: subjects x voxels

# Example placeholder shapes
# patterns_pos = np.random.randn(n_subjects, n_voxels)
# patterns_neg = np.random.randn(n_subjects, n_voxels)

3.2 Run‑wise extension

# patterns_pos[run]: subjects x voxels
# patterns_neg[run]: subjects x voxels

4) Inter‑Subject Similarity and Idiosyncrasy

from scipy.spatial.distance import pdist, squareform

# Similarity matrix across subjects
# correlation distance → similarity

def similarity_matrix(patterns):
    d = pdist(patterns, metric="correlation")
    sim = 1 - squareform(d)
    return sim

# Idiosyncrasy score per subject
# lower similarity to others = higher idiosyncrasy

def idiosyncrasy_score(patterns):
    sim = similarity_matrix(patterns)
    n = sim.shape[0]
    # mean similarity to *other* subjects (exclude the self-similarity diagonal)
    mean_to_others = (sim.sum(axis=1) - np.diag(sim)) / (n - 1)
    return 1 - mean_to_others

# Requires patterns_pos / patterns_neg (subjects × voxels) built in Section 3
idio_pos = idiosyncrasy_score(patterns_pos)
idio_neg = idiosyncrasy_score(patterns_neg)

5) Valence × SA Statistical Model

5.1 Long‑form data assembly

import pandas as pd

# Example: build a long-form dataframe
subjects = ["S001","S002"]

rows = []
for i, s in enumerate(subjects):
    rows.append({"subject": s, "valence": "pos", "idio": idio_pos[i]})
    rows.append({"subject": s, "valence": "neg", "idio": idio_neg[i]})

df = pd.DataFrame(rows)

5.2 Mixed effects model (Python)

import statsmodels.formula.api as smf

# df columns: subject, valence, idio, SA
# model = smf.mixedlm("idio ~ valence * SA", df, groups=df["subject"]).fit()
# print(model.summary())

5.3 Mixed effects model (R)

# lmer(idio ~ valence * SA + (1|subject), data=df)

6) Validation and Control Analyses

6.1 Split‑half reliability

# Split trials into odd/even
# patterns_pos_split1, patterns_pos_split2
# reliability = spearmanr(sim1[tri], sim2[tri])[0]
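That pseudocode can be made runnable with simulated data (in practice the two splits come from odd/even trials, not added noise):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Simulated subjects × voxels; both splits share each subject's true signal
true_patterns = rng.standard_normal((20, 100))
split1 = true_patterns + 0.5 * rng.standard_normal((20, 100))
split2 = true_patterns + 0.5 * rng.standard_normal((20, 100))

def subject_similarity(patterns):
    # 1 - correlation distance = inter-subject pattern correlation
    return 1 - squareform(pdist(patterns, metric="correlation"))

tri = np.tril_indices(20, k=-1)
reliability = spearmanr(subject_similarity(split1)[tri],
                        subject_similarity(split2)[tri])[0]
assert reliability > 0  # shared structure should survive the split
```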

6.2 Permutation test

import numpy as np
from scipy.stats import spearmanr

def perm_test_idio(patterns, n=1000, seed=0):
    """Test whether subjects share more pattern structure than chance.

    Note: permuting *subject order* leaves the set of pairwise similarities
    unchanged, so the null must instead break inter-subject alignment —
    here by shuffling voxel order independently for each subject.
    """
    rng = np.random.default_rng(seed)
    n_sub = patterns.shape[0]
    tri = np.tril_indices(n_sub, k=-1)
    obs = similarity_matrix(patterns)[tri].mean()
    null = []
    for _ in range(n):
        perm = np.vstack([rng.permutation(p) for p in patterns])
        null.append(similarity_matrix(perm)[tri].mean())
    p = (np.sum(np.array(null) >= obs) + 1) / (n + 1)
    return obs, p

7) Interpretation Logic


8) Output of Step 4

Next: Step 5 builds end‑to‑end pipeline from betas → ROI → RDMs → model fit → statistics.


PART V — END‑TO‑END PIPELINE

Step 5 — End‑to‑End Pipeline (Ultra‑Deep Version)

This file provides a complete, expandable pipeline with QA, validation, plotting, run‑wise and trial‑wise hooks, and reporting. It is designed to be swapped to real paths later with minimal changes.


0A) RSA‑learn Beta Generation (Run‑wise + Collapsed)

Goal: regenerate first‑level betas in a new output root, with per‑run peer×feedback betas plus peer‑only and feedback‑only betas, and then collapsed‑across‑runs versions of those same contrasts.

RSA‑learn output root (new): /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/RSA-learn

Current beta provenance (existing pipeline):

  1. Timing generator: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/code/afni/LEARN_1D_AFNItiming_Full.sh
  2. GLM spec: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/code/afni/LEARN_ap_Full_all.sh
  3. Per‑subject execution script: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/derivatives/afni/IndvlLvlAnalyses/<SUBJ>/proc.<SUBJ>.LEARN_070422
  4. Output bucket: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/derivatives/afni/IndvlLvlAnalyses/<SUBJ>/<SUBJ>.results.LEARN_070422/stats.<SUBJ>+tlrc.*

Inputs already present for GLM reruns (per subject):

  1. Preprocessed data per run: pb02.<SUBJ>.r01.scale+tlrc through pb02.<SUBJ>.r04.scale+tlrc in each *.results.LEARN_070422 folder
  2. Motion regressors: motion_demean.1D, motion_deriv.1D, sub-<SUBJ>_task-learn_allruns_motion.1D
  3. Event files (BIDS): sub-<SUBJ>_task-learn_run-0X_events.tsv in code/afni/TimingFiles/Full/sub-<SUBJ>/
  4. Existing parametric timing files (for reference): Mean60_fdkm.1D, Mean60_fdkm_run1.txt, etc.

Run‑wise redesign: what changes

  1. Create NonPM run‑wise timing files (one file per run and condition) from events.tsv.
  2. Expand 3dDeconvolve to include run‑specific regressors (one per condition per run).
  3. Add GLTs for peer‑only and feedback‑only per run and across runs.
  4. Save outputs to RSA-learn/derivatives/afni/IndvlLvlAnalyses/ to keep pipelines separate.

Example: NonPM run‑wise timing generation (Python)

import pandas as pd
from pathlib import Path

subj = "1055"
base = Path("/Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/code/afni/TimingFiles/Full")
run = 1
cond = "Mean_60_fdkm"  # peer×feedback condition

events = base / f"sub-{subj}" / f"sub-{subj}_task-learn_run-0{run}_events.tsv"
df = pd.read_csv(events, sep="\t")
rows = df[df["event"] == cond]
line = " ".join(f"{o:.3f}:{d:.3f}" for o, d in zip(rows["onset"], rows["duration"]))

out = base / f"sub-{subj}" / f"NonPM_{cond}_run{run}.1D"
out.write_text(line + "\n")

Example: run‑wise regressors in AFNI (concept)

# FBM Mean60, run 1–4 (NonPM)
-stim_times_AM1 1 stimuli/offset_NonPM_Mean60_fdkm_run1.1D 'dmBLOCK(0)'
-stim_times_AM1 2 stimuli/offset_NonPM_Mean60_fdkm_run2.1D 'dmBLOCK(0)'
-stim_times_AM1 3 stimuli/offset_NonPM_Mean60_fdkm_run3.1D 'dmBLOCK(0)'
-stim_times_AM1 4 stimuli/offset_NonPM_Mean60_fdkm_run4.1D 'dmBLOCK(0)'
-stim_label 1 FBM.Mean60.r1
-stim_label 2 FBM.Mean60.r2
-stim_label 3 FBM.Mean60.r3
-stim_label 4 FBM.Mean60.r4

Example: peer‑only GLT per run

-gltsym 'SYM: +FBM.Mean60.r1 +FBM.Mean80.r1 +FBM.Nice60.r1 +FBM.Nice80.r1'
-glt_label 1 FBM.r1

Example: feedback‑only GLT per run

-gltsym 'SYM: +FBM.Nice60.r2 +FBN.Nice60.r2 +FBM.Nice80.r2 +FBN.Nice80.r2'
-glt_label 2 NICE.r2

Deliverables to verify

  1. Per‑run betas: 8 peer×feedback × 4 runs
  2. Per‑run peer‑only: 4 peers × 4 runs
  3. Per‑run feedback‑only: 2 feedback types × 4 runs
  4. Collapsed‑across‑runs: 8 peer×feedback + 4 peer‑only + 2 feedback‑only

RSA‑learn scripts now created (paths on share):

Quick links (HTML key)

<a id="execution-checklist"></a> Execution checklist (pilot subject + verification)

AFNI timing interpretation fix (run‑wise files):

Standardized run‑wise proc + GLM pipeline (new):

  1. Script: /Volumes/Jarcho_DataShare/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh
  2. Purpose: loops subjects to (1) generate proc scripts, (2) clean output dirs (skips running jobs), (3) run the GLM from the correct working dir.
  3. Usage: bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh
  4. Discovery: auto‑detects subjects from RSA-learn/TimingFiles/Full/sub-* (fallback: bids/sub-*)
  5. Parallelism: MAX_JOBS=4 (adjust concurrency)
  6. Override root: SUBJ_ROOT=/path/to/sub-*
  7. Toggles: MAKE_PROC=0, CLEAN_OUT=0, or RUN_GLM=0 to skip steps
  8. Git‑tracked copies (repo): /Users/dannyzweben/Desktop/SDN/Y1_project/rsa-learn/scripts/

Timing files note (separate step): Run‑wise timing files are generated from BIDS events. After the nopred_fdbk relabeling fix (below), regenerate timing from the corrected BIDS tree using BIDS_DIR_OVERRIDE and TIMING_ROOT_OVERRIDE.

<a id="single-subject-trial"></a> Example: single-subject trial (what we did for 1055)

# Proc generation for one subject (make a one-off ap script)
AP_ORIG=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_ap_Full_RSA_runwise.sh
AP_TMP=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/tmp/LEARN_ap_Full_RSA_runwise_1055.sh
cp "$AP_ORIG" "$AP_TMP"
sed -i "s|^set subjects = .*|set subjects = ( 1055 )|" "$AP_TMP"
tcsh "$AP_TMP"

# Clean outputs that can trigger "already exists"
rm -rf /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/1055.results.LEARN_RSA_runwise
rm -rf /data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses/1055/1055.results.LEARN_RSA_runwise

# Run GLM from correct working directory
cd /data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses/1055
tcsh -xef proc.1055.LEARN_RSA_runwise |& tee output.proc.1055.LEARN_RSA_runwise

Standardized loop (all subjects, timing already generated)

# Auto-discover subjects from TimingFiles/Full/sub-*
bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh

# Parallelize (example)
MAX_JOBS=4 bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh

# Override discovery root (if needed)
SUBJ_ROOT=/data/projects/STUDIES/LEARN/fMRI/bids \
  bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh

<a id="exact-commands"></a> Exact commands used (trial + full run)

# Hardware check (server)
nproc

# Timing for all subjects (already run once; uses default subjList_LEARN.txt inside script)
bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_1D_AFNItiming_Full_RSA_runwise.sh

# Proc generation for 1055 only (trial)
AP_ORIG=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_ap_Full_RSA_runwise.sh
AP_TMP=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/tmp/LEARN_ap_Full_RSA_runwise_1055.sh
cp "$AP_ORIG" "$AP_TMP"
sed -i "s|^set subjects = .*|set subjects = ( 1055 )|" "$AP_TMP"
tcsh "$AP_TMP"

# Clean stale outputs that cause "already exists"
rm -rf /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/1055.results.LEARN_RSA_runwise
rm -rf /data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses/1055/1055.results.LEARN_RSA_runwise

# GLM run for 1055 (run from results dir to avoid relative output issues)
cd /data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses/1055
tcsh -xef proc.1055.LEARN_RSA_runwise |& tee output.proc.1055.LEARN_RSA_runwise

# Full cohort run in tmux (proc+clean+GLM; timing already generated)
tmux new -s rsa_all
MAX_JOBS=16 LOAD_LIMIT=20 bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline.sh

nproc context

<a id="undergrad-quickstart"></a> Undergrad quick‑start (the “right first run” script)

# Run timing once (if not already generated)
bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_1D_AFNItiming_Full_RSA_runwise.sh

# Full cohort GLM in tmux (proc + clean + GLM) — AFNI raw, no smoothing
tmux kill-session -t rsa_afni
tmux new -s rsa_afni \
"SUBJ_ROOT=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/TimingFiles/Fixed2 \
TIMING_ROOT_OVERRIDE=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/TimingFiles/Fixed2 \
BIDS_DIR_OVERRIDE=/data/projects/STUDIES/LEARN/fMRI/bids \
MAKE_PROC=1 CLEAN_OUT=1 RUN_GLM=1 \
MAX_JOBS=16 LOAD_LIMIT=999 \
bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline_afni_raw.sh"

<a id="nopred-fdbk-fix"></a> Event‑label fix (nopred_fdbk → correct feedback)

# Relabel nopred_fdbk using canonical run template
python3 /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_fix_nopred_fdbk_by_template.py \
  --bids-dir /data/projects/STUDIES/LEARN/fMRI/RSA-learn/bids_fixed \
  --out-dir  /data/projects/STUDIES/LEARN/fMRI/RSA-learn/bids_fixed2 \
  --report   /data/projects/STUDIES/LEARN/fMRI/RSA-learn/reports/nopred_fdbk_fix_template.tsv \
  --mode majority

# Regenerate timing from fixed BIDS (Fixed2)
# If the server script is hard-coded, patch a temp copy:
TMP=/tmp/LEARN_1D_AFNItiming_Full_RSA_runwise_fixed2.sh
cp /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_1D_AFNItiming_Full_RSA_runwise.sh "$TMP"
sed -i 's|BIDS_DIR="[^"]*"|BIDS_DIR="/data/projects/STUDIES/LEARN/fMRI/RSA-learn/bids_fixed2"|' "$TMP"
sed -i 's|TIMING_ROOT="[^"]*"|TIMING_ROOT="/data/projects/STUDIES/LEARN/fMRI/RSA-learn/TimingFiles/Fixed2"|' "$TMP"
bash "$TMP"

<a id="pipeline-basics"></a> Baseline pipeline (timing → proc → GLM/GLT)

  1. Timing files: run‑wise NonPM timing (NonPM_*_runX.1D) generated for each subject.
  2. Proc generation: LEARN_ap_Full_RSA_runwise_AFNI_noblur.sh builds AFNI raw‑BIDS proc scripts using local timing (-local_times) and no blur block (unsmoothed for RSA).
  3. GLM/GLT: run‑wise model fits 14 betas per run (peer, feedback, peer×feedback) and GLTs compute the same 14 contrasts averaged across all available runs (including 2–3 run subjects via dynamic GLTs).
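The 14 per-run betas in item 3 decompose as 4 peer + 2 feedback + 8 peer×feedback regressors. A sketch of the expected label set, with naming assumed from the 3dinfo checks later in this document (Mean60.r1, FBM.r1, FBM.Mean60.r1; FBM/FBN denote the two feedback types):

```python
# Enumerate the 14 per-run condition labels (naming assumed, not canonical).
peers = ["Mean60", "Mean80", "Nice60", "Nice80"]
feedback = ["FBM", "FBN"]

def run_labels(run):
    peer_only = [f"{p}.r{run}" for p in peers]                       # 4 peer regressors
    fb_only = [f"{f}.r{run}" for f in feedback]                      # 2 feedback regressors
    peer_fb = [f"{f}.{p}.r{run}" for f in feedback for p in peers]   # 8 interactions
    return peer_only + fb_only + peer_fb

print(len(run_labels(1)))  # 14
```

Spot-checking `3dinfo -label` output against a list like this catches misnamed or dropped regressors before RSA extraction.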

<a id="script-excerpts"></a> Script excerpts (key lines)

# LEARN_1D_AFNItiming_Full_RSA_runwise.sh (timing for all subjects)
SUBJ_LIST="/data/projects/STUDIES/LEARN/fMRI/code/afni/subjList_LEARN.txt"
TIMING_ROOT="/data/projects/STUDIES/LEARN/fMRI/RSA-learn/TimingFiles/Full"
for subj in `cat ${SUBJ_LIST}`; do
  mkdir -p "${TIMING_ROOT}/sub-${subj}"
  # ... NonPM_*_runX.1D creation ...
done

# LEARN_ap_Full_RSA_runwise_AFNI_noblur.sh (AFNI raw, no smoothing)
-regress_opts_3dD \
    -local_times \

# LEARN_run_RSA_runwise_pipeline_afni_raw.sh (standardized loop + 2–3 run fallback)
SUBJ_ROOT="${SUBJ_ROOT:-$TIMING_ROOT}"
find "$SUBJ_ROOT" -maxdepth 1 -type d -name "sub-*"
AP_TMP="$TMP_DIR/LEARN_ap_Full_RSA_runwise_AFNI_${subj}.sh"
sed -i "s|^set subjects = .*|set subjects = ( ${subj} )|" "$AP_TMP"
mapfile -t RUNS < <(find "$BIDS_DIR/sub-${subj}/func" -maxdepth 1 -type f -name "sub-${subj}_task-learn_run-*_bold.nii.gz" \
  | sed -E 's/.*run-0*([0-9]+).*/\1/' | sort -n)
if [ "$run_count" -lt 2 ]; then
  echo "[RSA-learn] SKIP (runs <2): $subj"
fi
if [ "$run_count" -lt 4 ]; then
  # rewrite AP_TMP to available runs + recompute GLTs
fi
OUT_DIR="$RESULTS_DIR/$subj/${subj}.results.LEARN_RSA_runwise_AFNI"
rm -rf "$OUT_DIR" "$SCRIPT_DIR/${subj}.results.LEARN_RSA_runwise_AFNI"
cd "$RESULTS_DIR/$subj" && tcsh -xef "proc.${subj}.LEARN_RSA_runwise_AFNI" |& tee "output.proc.${subj}.LEARN_RSA_runwise_AFNI"
MAX_JOBS=4 bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_pipeline_afni_raw.sh

<a id="glm-go-for-it"></a> One‑off GLM fix (3dDeconvolve collinearity)

# Audit: identify the lone missing subject and confirm 3dDeconvolve warning
RESULTS=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses
TIMING=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/TimingFiles/Fixed2
for d in $TIMING/sub-*; do
  id=${d##*sub-}
  stats="$RESULTS/$id/${id}.results.LEARN_RSA_runwise_AFNI/stats.${id}+tlrc.HEAD"
  [ ! -f "$stats" ] && echo "$id"
done | sort -n

id=1522
egrep -n "ERROR|FATAL" "$RESULTS/$id/output.proc.${id}.LEARN_RSA_runwise_AFNI" | tail -n 6
sed -n '1,40p' "$RESULTS/$id/${id}.results.LEARN_RSA_runwise_AFNI/3dDeconvolve.err"

# GLM-only rerun with GOFORIT (no re-preprocess)
tmux kill-session -t rsa_1522_glm 2>/dev/null
cat > /tmp/rsa_1522_glm.sh <<'EOS'
set -e
id=1522
BASE=/data/projects/STUDIES/LEARN/fMRI/RSA-learn/derivatives/afni/IndvlLvlAnalyses/$id
OUT=$BASE/${id}.results.LEARN_RSA_runwise_AFNI
PROC=$BASE/proc.${id}.LEARN_RSA_runwise_AFNI
export PROC

# Insert -GOFORIT 1 into the proc script's 3dDeconvolve call (idempotent)
python3 - <<'PY'
import os
from pathlib import Path
path = Path(os.environ['PROC'])
lines = path.read_text().splitlines()
if not any('GOFORIT' in l for l in lines):
    out = []
    inserted = False
    for l in lines:
        out.append(l)
        if (not inserted) and '-polort 3 -float' in l:
            out.append('    -GOFORIT 1                                                     \\')
            inserted = True
    if not inserted:
        raise SystemExit('GOFORIT insertion point not found')
    path.write_text('\n'.join(out) + '\n')
PY

rm -f $OUT/stats.${id}+tlrc.* $OUT/stats.${id}_REML* $OUT/cbucket* $OUT/fitts* $OUT/errts* $OUT/X.* \
      $OUT/3dDeconvolve.err $OUT/run_3dDeconvolve.body.tcsh $OUT/run_3dDeconvolve.tcsh

cd $OUT
# Pull just the 3dDeconvolve call out of the proc script and rerun it in place
awk 'BEGIN{p=0} /^3dDeconvolve /{p=1} p{if ($0 ~ /^if \( \$status \)/) {exit} else print}' "$PROC" > run_3dDeconvolve.body.tcsh
{ echo "set subj = $id"; cat run_3dDeconvolve.body.tcsh; } > run_3dDeconvolve.tcsh
tcsh -xef run_3dDeconvolve.tcsh |& tee 3dDeconvolve.rerun.log
tcsh -xef stats.REML_cmd |& tee 3dREMLfit.rerun.log
EOS
tmux new -s rsa_1522_glm "bash /tmp/rsa_1522_glm.sh"
Example beta map snapshots (sub‑1290)

Four peer×feedback betas (one per run), rendered with @chauffeur_afni and saved as PNGs:

Conditions shown (each with axial, coronal, and sagittal views):

  • FBM.Mean60.r1
  • FBN.Mean80.r2
  • FBM.Nice60.r3
  • FBN.Nice80.r4

<a id="audit-report"></a> Audit (post‑run) — what we ran, what we learned, what was found

Audit command (run on server)

# RSA‑learn full audit
BASE=/data/projects/STUDIES/LEARN/fMRI/RSA-learn
RESULTS="$BASE/derivatives/afni/IndvlLvlAnalyses"
TIMING="$BASE/TimingFiles/Full"
REPORT="/tmp/rsa_run_audit_$(date +%Y%m%d_%H%M).txt"

{
  echo "=== RSA RUN AUDIT ==="
  echo "Timestamp: $(date)"
  echo "RESULTS: $RESULTS"
  echo "TIMING:  $TIMING"
  echo

  echo "=== SUBJECT DISCOVERY ==="
  SUBJECTS=$(find "$TIMING" -maxdepth 1 -type d -name "sub-*" -printf "%f\n" 2>/dev/null | sed 's/^sub-//' | sort -u)
  echo "Subjects found in timing: $(echo "$SUBJECTS" | sed '/^$/d' | wc -l)"
  echo

  echo "=== OUTPUT COUNTS ==="
  STATS=$(find "$RESULTS" -name "stats.*+tlrc.HEAD" 2>/dev/null)
  echo "Stats HEAD files: $(echo "$STATS" | sed '/^$/d' | wc -l)"
  echo "Output.proc logs: $(find "$RESULTS" -name "output.proc.*" 2>/dev/null | wc -l)"
  echo

  echo "=== MISSING OUTPUTS (by subject) ==="
  for s in $SUBJECTS; do
    stats="$RESULTS/$s/${s}.results.LEARN_RSA_runwise/stats.${s}+tlrc.HEAD"
    log="$RESULTS/$s/output.proc.${s}.LEARN_RSA_runwise"
    if [ ! -f "$stats" ]; then echo "MISSING stats: $s"; fi
    if [ ! -f "$log" ]; then echo "MISSING log:   $s"; fi
  done
  echo

  echo "=== RUN COMPLETION CHECK ==="
  for s in $SUBJECTS; do
    log="$RESULTS/$s/output.proc.${s}.LEARN_RSA_runwise"
    if [ -f "$log" ]; then
      if ! grep -q "execution finished" "$log"; then
        echo "NO FINISH LINE: $s"
      fi
    fi
  done
  echo

  echo "=== HIGH-SEVERITY ERRORS ==="
  grep -H -n -E "\\*\\* (ERROR|FATAL)|ERROR|FATAL|FAILED|ABORT|Segmentation|Segfault|terminate" \
    $RESULTS/*/output.proc.* 2>/dev/null || true
  echo

  echo "=== FILE/PATH ERRORS ==="
  grep -H -n -E "No such file|not found|missing|cannot|cannot open|failed to open|Permission denied" \
    $RESULTS/*/output.proc.* 2>/dev/null || true
  echo

  echo "=== QC WARNINGS (non-fatal) ==="
  grep -H -n -E "failed to find volreg dset|failed to find motion enorm dset|failed to init basics" \
    $RESULTS/*/output.proc.* 2>/dev/null || true
  echo

  echo "=== TIMING FORMAT CHECKS ==="
  grep -H -n -E "local_times|rows does not match" \
    $RESULTS/*/output.proc.* 2>/dev/null || true
  echo
} | tee "$REPORT"

echo "Report saved to $REPORT"

How we learned it

Findings (from that audit)

Follow‑up audit (standard AFNI vs RSA runwise)

Targeted rerun script (no subject list)

Script: /Volumes/Jarcho_DataShare/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_rerun_from_standard.sh

Behavior:

  1. Finds subjects with standard AFNI stats but missing RSA runwise stats
  2. Skips subjects with missing timing/confounds or <2 fMRIPrep runs
  3. If a subject has 2–3 runs, rewrites afni_proc inputs to those runs and recomputes GLTs over available runs
  4. Runs proc + clean + GLM for the remaining set
  5. Logs skip reasons to RSA-learn/logs/rerun_missing_YYYYMMDD_HHMM.log
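The GLT recomputation in item 3 amounts to replacing the fixed four-run average with an equal weighting over whichever runs survive. A hypothetical helper (illustrative only, not a copy of the server script; label naming assumed from the 3dinfo checks in this document):

```python
# Build a -gltsym expression that averages one condition over the available runs.
def dynamic_glt(cond, runs):
    w = 1.0 / len(runs)
    return "SYM: " + " ".join(f"+{w:g}*{cond}.r{r}" for r in runs)

print(dynamic_glt("FBM.Mean60", [1, 2, 3]))
```

For a 2-run subject this yields `+0.5` weights; for 4 runs it reduces to the standard quarter-weight GLT.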

Targeted rerun (tmux)

tmux new -s rsa_rerun
MAX_JOBS=16 LOAD_LIMIT=20 \
  bash /data/projects/STUDIES/LEARN/fMRI/RSA-learn/scripts/LEARN_run_RSA_runwise_rerun_from_standard.sh
  1. Generate RSA‑learn timing files (run‑wise NonPM):
    • Script: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/RSA-learn/scripts/LEARN_1D_AFNItiming_Full_RSA_runwise.sh
    • Expect: RSA-learn/TimingFiles/Full/sub-<ID>/NonPM_*_runX.1D
  2. Generate afni_proc scripts (no execution yet):
    • Script: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/RSA-learn/scripts/LEARN_ap_Full_RSA_runwise.sh
    • Expect: RSA-learn/derivatives/afni/IndvlLvlAnalyses/<ID>/proc.<ID>.LEARN_RSA_runwise
  3. Pilot run 1 subject (server):
    • Wrapper: /Users/dannyzweben/Desktop/SDN/Y1_project/fmri-data/LEARN_share/RSA-learn/scripts/LEARN_RunAFNIProc_RSA_runwise.sh
    • Edit subject list to a single ID for timing + execution.
  4. Verify outputs (after pilot finishes):
    • Check stats bucket exists:
      • stats.<ID>+tlrc.HEAD and stats.<ID>+tlrc.BRIK.gz
    • Check run‑wise labels:
      • 3dinfo -label stats.<ID>+tlrc.HEAD | tr '~' ' ' | grep -E 'FBM.Mean60.r1|FBN.Mean60.r1|FBM.Nice80.r4'
    • Check GLT labels:
      • 3dinfo -label stats.<ID>+tlrc.HEAD | tr '~' ' ' | grep -E 'Mean60.r1|FBM.r1|FBM.Mean60.all|FBM.all'

These match the existing LEARN pipeline style and are built to be run on the Linux server (not locally).

Recommended directory layout

RSA-learn/
  scripts/
  derivatives/afni/IndvlLvlAnalyses/
  logs/
  notes/

0) Global Config

# =============================
# CONFIG
# =============================
DATA_DIR = "/path/to/betas"      # update later
ROI_DIR  = "/path/to/rois"       # update later
OUT_DIR  = "/path/to/output"     # update later

SUBJECTS = ["S001", "S002"]
ROIS     = ["vmPFC", "dACC", "ant_ins", "post_ins", "vStriatum"]
PEERS    = ["P1", "P2", "P3", "P4"]
VALENCE  = ["pos", "neg"]
RUNS     = [1,2,3,4]

BETA_FMT = "{subj}_{roi}_{peer}_{val}.nii.gz"          # averaged
RUN_FMT  = "{subj}_{roi}_run{run}_{peer}.nii.gz"       # run-wise
TRIAL_FMT= "{subj}_{roi}_run{run}_trial{trial}.nii.gz" # trial-wise
ROI_FMT  = "{roi}.nii.gz"

1) Manifest + QA

import os, pandas as pd

def build_manifest_avg():
    rows=[]
    for subj in SUBJECTS:
        for roi in ROIS:
            for peer in PEERS:
                for val in VALENCE:
                    path = f"{DATA_DIR}/" + BETA_FMT.format(subj=subj, roi=roi, peer=peer, val=val)
                    rows.append({"subject":subj,"roi":roi,"peer":peer,"valence":val,"beta_path":path,"exists":os.path.exists(path)})
    return pd.DataFrame(rows)

manifest = build_manifest_avg()
missing = manifest[~manifest["exists"]]
if not missing.empty:
    print("Missing files:")
    print(missing.head())

2) ROI Extraction + QA

import nibabel as nib
import numpy as np
from nilearn.masking import apply_mask

def extract_roi_vector(beta_path, roi_path):
    beta_img = nib.load(beta_path)
    roi_img = nib.load(roi_path)
    vec = apply_mask(beta_img, roi_img)
    vec[vec == 0] = np.nan  # flag exact-zero voxels (no coverage); handle NaNs downstream
    return vec

def roi_voxel_count(roi_path):
    data = nib.load(roi_path).get_fdata()
    return int((data>0).sum())

for roi in ROIS:
    print(roi, roi_voxel_count(f"{ROI_DIR}/"+ROI_FMT.format(roi=roi)))

3) Pattern Matrices

3.1 Peer‑level patterns (4×voxels)


def build_peer_matrix(subj, roi):
    roi_path = f"{ROI_DIR}/" + ROI_FMT.format(roi=roi)
    patterns=[]
    for peer in PEERS:
        vecs=[]
        for val in VALENCE:
            beta_path = f"{DATA_DIR}/" + BETA_FMT.format(subj=subj, roi=roi, peer=peer, val=val)
            vecs.append(extract_roi_vector(beta_path, roi_path))
        patterns.append(np.nanmean(np.vstack(vecs), axis=0))
    return np.vstack(patterns)

3.2 Peer×Feedback patterns (8×voxels)


def build_peer_feedback_matrix(subj, roi):
    roi_path = f"{ROI_DIR}/" + ROI_FMT.format(roi=roi)
    patterns=[]
    for peer in PEERS:
        for val in VALENCE:
            beta_path = f"{DATA_DIR}/" + BETA_FMT.format(subj=subj, roi=roi, peer=peer, val=val)
            patterns.append(extract_roi_vector(beta_path, roi_path))
    return np.vstack(patterns)

3.3 Run‑wise peer patterns (future)

# run-wise peer patterns: 4 peers x voxels for each run

def build_peer_matrix_run(subj, roi, run):
    roi_path = f"{ROI_DIR}/" + ROI_FMT.format(roi=roi)
    patterns=[]
    for peer in PEERS:
        beta_path = f"{DATA_DIR}/" + RUN_FMT.format(subj=subj, roi=roi, run=run, peer=peer)
        patterns.append(extract_roi_vector(beta_path, roi_path))
    return np.vstack(patterns)

3.4 Trial‑wise patterns (future)

# trial-wise beta series: trial x voxels

def build_trial_matrix(subj, roi, run, n_trials):
    roi_path = f"{ROI_DIR}/" + ROI_FMT.format(roi=roi)
    patterns=[]
    for t in range(1, n_trials+1):
        beta_path = f"{DATA_DIR}/" + TRIAL_FMT.format(subj=subj, roi=roi, run=run, trial=t)
        patterns.append(extract_roi_vector(beta_path, roi_path))
    return np.vstack(patterns)

4) Neural RDMs

import numpy as np

def neural_rdm(patterns):
    # Zero voxels were flagged as NaN at extraction; drop any voxel that is
    # NaN in any condition so np.corrcoef stays finite.
    keep = ~np.isnan(patterns).any(axis=0)
    corr = np.corrcoef(patterns[:, keep])
    return 1 - corr
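A quick sanity check on synthetic data (shapes assumed: 4 conditions × 50 voxels): a correlation-distance RDM should be symmetric with a zero diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
patterns = rng.standard_normal((4, 50))  # 4 conditions x 50 voxels (synthetic)
rdm = 1 - np.corrcoef(patterns)          # correlation distance, as in neural_rdm

assert rdm.shape == (4, 4)
assert np.allclose(np.diag(rdm), 0.0)    # zero self-dissimilarity
assert np.allclose(rdm, rdm.T)           # symmetric
```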

5) Model Fit

from scipy.stats import spearmanr

def model_fit(neural_rdm, model_rdm):
    tri = np.tril_indices_from(neural_rdm, k=-1)
    r,_ = spearmanr(neural_rdm[tri], model_rdm[tri])
    return np.arctanh(r)
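The batch pipeline below assumes the model RDMs (`rdm_disp`, `rdm_pred`, `rdm_combo`, `rdm_feedback`, `rdm_peer_feedback`) already exist. A minimal sketch of one way to construct them, under the assumption that the four peers vary on disposition (Mean vs Nice) and reinforcement rate (60% vs 80%), as suggested by condition labels like Mean60 and Nice80 earlier in this document; the authoritative hypothesis RDMs should come from the LEARN slides (§5.2b).

```python
import numpy as np

# ASSUMED peer attributes, in the same order as PEERS in the config above.
disp = np.array([0, 0, 1, 1])            # 0 = Mean, 1 = Nice disposition
rate = np.array([0.6, 0.8, 0.6, 0.8])    # nominal reinforcement rate

def feature_rdm(v):
    """Normalized pairwise absolute difference on one feature -> model RDM."""
    v = np.asarray(v, dtype=float)
    d = np.abs(v[:, None] - v[None, :])
    return d / d.max() if d.max() > 0 else d

rdm_disp = feature_rdm(disp)             # 4x4: same vs different disposition
rdm_pred = feature_rdm(rate)             # 4x4: reinforcement-rate distance
rdm_combo = (rdm_disp + rdm_pred) / 2    # equal-weight combination

# 8-condition models, peer-major order (matches build_peer_feedback_matrix)
peer8 = np.repeat(np.arange(4), 2)       # P1,P1,P2,P2,...
val8 = np.tile([1, 0], 4)                # pos,neg within each peer
rdm_feedback = feature_rdm(val8)         # 8x8: feedback valence only
peer_diff = (peer8[:, None] != peer8[None, :]).astype(float)
rdm_peer_feedback = (peer_diff + rdm_feedback) / 2  # peer identity x valence
```

The equal weighting in `rdm_combo` is a placeholder; fitted weights (or entering disposition and rate as separate regressors) may be preferable.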

6) Batch Pipeline (Averaged Betas)

results=[]
for subj in SUBJECTS:
    for roi in ROIS:
        peer_patterns = build_peer_matrix(subj, roi)
        rdm_peer = neural_rdm(peer_patterns)

        fit_disp = model_fit(rdm_peer, rdm_disp)
        fit_pred = model_fit(rdm_peer, rdm_pred)
        fit_combo = model_fit(rdm_peer, rdm_combo)

        pf_patterns = build_peer_feedback_matrix(subj, roi)
        rdm_pf = neural_rdm(pf_patterns)
        fit_fb = model_fit(rdm_pf, rdm_feedback)
        fit_peerfb = model_fit(rdm_pf, rdm_peer_feedback)

        results.append({
            "subject":subj,
            "roi":roi,
            "fit_disp":fit_disp,
            "fit_pred":fit_pred,
            "fit_combo":fit_combo,
            "fit_fb":fit_fb,
            "fit_peerfb":fit_peerfb,
        })

results_df = pd.DataFrame(results)
results_df.to_csv(f"{OUT_DIR}/rsa_model_fits.csv", index=False)

7) Run‑wise Pipeline (Future)

run_results=[]
for subj in SUBJECTS:
    for roi in ROIS:
        for run in RUNS:
            peer_patterns = build_peer_matrix_run(subj, roi, run)
            rdm_peer = neural_rdm(peer_patterns)
            fit_combo = model_fit(rdm_peer, rdm_combo)
            run_results.append({"subject":subj,"roi":roi,"run":run,"fit_combo":fit_combo})

run_df = pd.DataFrame(run_results)
run_df.to_csv(f"{OUT_DIR}/rsa_model_fits_by_run.csv", index=False)

8) Trial‑wise Pipeline (Future)

# Example: build trial-wise RDM and compare to PE-sign model
from scipy.stats import spearmanr

trial_results=[]
for subj in SUBJECTS:
    for roi in ROIS:
        for run in RUNS:
            trial_patterns = build_trial_matrix(subj, roi, run, n_trials=32)  # placeholder
            rdm_trial = neural_rdm(trial_patterns)
            # compare to trial-level model RDM (e.g., PE sign)
            # fit = model_fit(rdm_trial, model_pe_sign)

9) Validation + Diagnostics

9.1 Split‑half reliability

# Example: odd/even trial splits
# r = spearmanr(rdm_odd[tri], rdm_even[tri])[0]
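The commented split above can be made concrete with a small helper (a sketch; it assumes two RDMs already computed from odd- and even-trial patterns):

```python
import numpy as np
from scipy.stats import spearmanr

def split_half_reliability(rdm_odd, rdm_even):
    """Spearman correlation between the lower triangles of two RDMs."""
    tri = np.tril_indices_from(rdm_odd, k=-1)
    return float(spearmanr(rdm_odd[tri], rdm_even[tri])[0])
```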

9.2 Permutation testing

def perm_test(neural_rdm, model_rdm, n=1000):
    tri = np.tril_indices_from(neural_rdm, k=-1)
    obs = spearmanr(neural_rdm[tri], model_rdm[tri])[0]
    null=[]
    for _ in range(n):
        perm = np.random.permutation(neural_rdm.shape[0])
        perm_rdm = neural_rdm[np.ix_(perm, perm)]
        null.append(spearmanr(perm_rdm[tri], model_rdm[tri])[0])
    p = (np.sum(np.array(null) >= obs)+1)/(n+1)
    return obs, p

9.3 Noise ceiling

# group_rdm = np.mean(subj_rdms, axis=0)
# ceiling = np.mean([spearmanr(rdm_s[tri], group_rdm[tri])[0] for rdm_s in subj_rdms])
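The commented ceiling above, as a runnable sketch. Note that correlating each subject's RDM with a group mean that includes that subject behaves as an upper bound; a leave-one-out group mean gives the lower bound.

```python
import numpy as np
from scipy.stats import spearmanr

def noise_ceiling_upper(subj_rdms):
    """Mean Spearman correlation of each subject's RDM with the group-mean RDM."""
    group = np.mean(subj_rdms, axis=0)
    tri = np.tril_indices_from(group, k=-1)
    return float(np.mean([spearmanr(r[tri], group[tri])[0] for r in subj_rdms]))
```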

10) Stats + Outputs

import statsmodels.formula.api as smf
# df = results_df.merge(sa_table, on="subject")
# m = smf.mixedlm("fit_combo ~ SA", df, groups=df["subject"]).fit()
# print(m.summary())

summary = results_df.groupby("roi").mean(numeric_only=True)
summary.to_csv(f"{OUT_DIR}/rsa_summary_by_roi.csv")

11) Visualization (Optional)

import seaborn as sns
import matplotlib.pyplot as plt

sns.heatmap(rdm_peer, square=True, cmap="mako")
plt.title("Peer RDM")
plt.show()

12) Output

Next: Step 6 assembles everything into the final presentation.


PART VI — NEXT STEPS

  1. Replace placeholder paths with real beta + ROI directories.
  2. Add subject table (SA, age, sex, motion).
  3. Run pipeline end‑to‑end.
  4. Populate tables + figures for PI presentation.